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Abstract:  We  consider  the  task  of  numerically  approximating  the  solution  of  an 

ordinary  differential  equation  initial  value  problem.  A methodology  is  given  for 
determining  the  computational  complexity  of  finding  an  approximate  solution  with  error 
not  exceeding  t.  In  addition,  we  determine  the  method  of  optimal  order  within  a given 
class  of  methods,  and  show  that  under  reasonable  hypotheses,  i uptime!  order 
increases  as  s decreases,  tending  to  infinity  as  t tends  to  zero. 
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1.  Introduction 


With  few  exceptions,  past  work  in  analytic  computational  complexity  has  focused 
on  the  problem  of  finding  a zero  of  a (nonlinear)  transformation  of  Banach  spaces;  in 
most  work,  this  problem  is  specialized  to  that  of  finding  a zero  of  an  operator  on  a 


finite-dimensional  real  or  complex  vector  space  (and  in  much  of  this  work,  the  problem 
is  further  specialized  to  the  one-dimensional  case).  Much  has  been  discovered  about 
the  computational  aspects  of  iterative  schemes  for  the  solution  of  such  problems, 
especially  in  the  areas  of  minimal  complexity  (e.g.,  Kung  and  Traub  [73],  Traub  and 
Wofniakowski  [76])  and  maximal  order  (e.g.,  Kung  and  Traub  [74],  Woiniakowski  [75]). 

In  this  paper,  we  will  consider  another  topic  in  analytic  computational  complexity 
theory,  that  of  finding  complexity  bounds  for  the  numerical  solution  of  ordinary 
differential  equation  initial-value  problems  on  a fixed  interval.  We  will  not  be 
interested  in  questions  of  the  existence  and  the  uniqueness  of  the  solutions  to  such 
problems.  In  fact,  we  will  restrict  our  discussion  of  the  application  of  general  results 
to  the  case  where  the  unique  solutions  to  these  problems  are  analytic  functions.  (The 
techniques  described  in  this  paper  are  applied  to  several  well-known  classes  of 
methods  in  Werschulz  [76a],  [76b]) 

We  will  limit  ourselves  here  to  classes  of  one -step  methods  for  the  numerical 
solution  of  these  problems;  in  terms  of  informational  usage,  these  methods  are 


analogous  to  iterative  zero-finding  methods  without  memory  (Traub  [64],  [72]). 


Analogous  to  the  one-point  iterative  methods  with  memory  are  the  multistep  methods 
for  initial-value  problems;  these  methods  will  be  dealt  with  in  a future  paper. 

Our  approach  will  be  to  assume  that  an  initial-value  problem  is  given,  along  with 


2 


some  error  criterion  <,  where  0 < i < 1;  we  then  wish  to  compute  an  approximate 
solution  with  error  no  greater  than  «.  Two  basic  questions  concern  us: 

(1.)  For  any  given  method,  what  is  the  complexity  of  solving  this  problem? 

(2.)  Given  any  "basic"  sequence  of  methods  with  increasing  order,  which  method 
has  minimal  complexity? 

In  Section  2,  we  describe  a methodology  that  handles  both  questions  for  classes 
consisting  of  methods  whose  error  functions  have  a special  form.  Furthermore,  we 
find  that  within  such  a basic  sequence  of  methods,  the  following  hold  under  very 
general  conditions: 

(1.)  For  any  i,  there  is  a unique  choice  of  order  and  step  size  minimizing  the 
complexity. 

(2.)  As  « decreases,  both  the  optimal  order  and  the  complexly  increase 
monotonically,  tending  to  infinity  as  i tends  to  zero. 

Furthermore,  within  many  classes  of  problems  and  methods,  the  "penalty"  <e.g.,  the 

amount  the  cost  curve  turns  near  the  optimum)  associated  with  using  non-optimal 

order  tends  to  infinity  as  t tends  to  zero. 

These  conclusions  are  an  interesting  contrast  to  Known  results  on  zero-finding 
via  iterations  without  memory.  The  latter  results  tend  to  support  the  "folklore"  idea 
that  it  is  "better"  to  use  a low-order  method  many  times,  than  to  use  a high-order 
method  a few  times.  In  the  one-point  case,  optimal  order  is  low,  while  in  the  multipoint 
case,  optimal  order  increases  with  the  problem  complexity  'but  with  little  penalty  for 
using  a method  of  non-optimal  order)  (Kung  and  Traub  [73]).  In  addition,  optimal  order 
for  these  problems  does  not  depend  on  the  error  criterion;  it  is  computed  for  the 


One  may  wonder  why  there  is  this  discrepancy  between  the  results  for  the 
initial-value  problem  and  those  for  the  zero-finding  problem,  since  any  initial-value 


problem  may  be  written  as  an  operator  equation,  as  in  Stetter  [73}  The  reason  for 
this  is  that  the  methods  used  for  the  two  problems  differ  greatly—those  for  the  initial- 
value  problem  compute  estimates  for  solution  values  at  new  points  by  discretization. 
while  those  for  zero-finding  compute  improved  estimates  for  the  zero  of  a function  by 
iteration. 

In  Section  3,  we  discuss  the  extension  of  these  results  to  classes  consisting  of 
methods  whose  error  functions  are  somewhat  more  complicated  than  those  considered 
in  Section  2. 

In  Section  4,  we  introduce  the  notions  of  normality  and  order -convergence  for  a 
basic  sequence  of  one-step  methods.  We  prove  that  they  are  equivalent  under  certain 
circumstances.  A basic  sequence  of  methods  enjoying  these  properties  is  very  easy  to 
deal  with  in  many  respects,  especially  when  one  is  interested  in  comparing  upper  and 
lower  complexity  bounds  for  such  a class. 

Finally,  in  Section  5,  we  describe  some  numerical  data  that  support  the  above 
theoretical  results.  In  particular,  these  data  seem  to  indicate  that  even  for  modest 
values  of  t,  there  are  considerable  savings  in  using  methods  of  optimal,  rather  than 
fixed,  order. 
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2.  Optimality  Within  a Strong  Basic  Sequence 

We  are  interested  in  the  numerical  solution  of  a class  of  ordinary  differential 
equation  initial-value  problems  on  a fixed  interval  I of  finite  length;  we  take  I • [0,  1] 
without  loss  of  generality.  More  precisely,  let  3)  be  a set  of  initial-value  points  in  the 
real  N-dimensional  linear  space  RN  , and  let  1$  be  a set  of  operators  on  R^ , such  that 
the  initial-value  problem  of  finding  a function  x:  I -»  R^  satisfying 

x(t)  - v(x(t)>  if  t « int  I , 

(2.1) 

x(0)  - x0 

has  a unique  solution  for  every  (xq  , v)  < J)x^.  The  autonomous  form  of  this  system 
is  no  restriction,  since  any  non -autonomous  system  may  be  made  autonomous  by 
increasing  the  dimension  of  the  system  by  one. 

The  model  of  computation  to  be  used  is  fairly  general.  We  assume  only  that  all 
arithmetic  operations  are  performed  exactly  in  R (i.e.,  infinite-precision  arithmetic), 
and  that  for  any  algorithm  to  be  considered  for  the  solution  of  (2.1),  a set  of 
procedures  is  given  for  the  computation  of  any  information  about  v required  by  that 
algorithm.  (For  instance,  with  Runge-Kutta  methods,  we  must  be  able  to  compute  v at 
any  point  in  its  domain.) 

In  this  paper,  we  are  interested  in  the  numerical  solution  of  (2.1)  via  one-step 
methods,  using  an  equidistant  grid  as  defined  in  Stetter  [73].  (We  limit  ourselves  to 
equidistant  grids  in  order  to  facilitate  the  comparison  of  methods  of  different  orders; 
the  other  extreme  is  taken  by  Lindberg  [74],  who  considers  the  problem  of  picking  an 
optimal  grid  for  a given  method  of  fixed  order.)  Thus  the  methods  considered  will 
generate  approximations  Xj  to  x(tj)  by  the  recursion 

Xj+i  - xj  ♦ h *<xj , h)  (0  s i s n - 1 , n ■ h"1) , 


(2.2) 
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where  h is  the  step-size  and  p is  the  increment  function  for  the  method  (Henrici  [62]h 
for  briefness,  we  will  refer  to  "the  method  Despite  the  fact  that  ?(x;  , h)  will 
depend  on  some  information  about  v,  we  will  not  explicitly  indicate  this  dependence. 

Thus,  the  method  f produces  an  approximation  to  the  true  solution  of  (2.1).  We 
want  to  measure  the  discrepancy  between  the  approximate  and  true  solutions.  Various 
error  measures  have  been  introduced  in  the  literature.  These  include  the  local 
truncation  error  per  step,  the  local  truncation  error  per  unit  step,  and  the  global  error; 
see  Henrici  [62]  or  Stetter  [73]  for  definitions.  These  error  measures  may  be  either 
absolute  or  relative  (in  the  usual  sense);  they  may  be  measured  either  at  the  endpoint 
of  the  interval  (as  in  Henrici  [62],  Hindmarsh  [74])  or  over  the  entire  grid  (as  in 
Sandberg  [67],  Lindberg  [74]).  There  has  been  a great  deal  of  discussion  of  which 
error  criterion  is  the  best  one  to  use;  for  instance,  Gear  [71]  (Section  9.3)  uses  local 
error  per  step,  while  Hull  et  al.  [72]  use  local  error  per  unit  step.  We  take  no  sides  in 
this  discussion,  since  any  of  these  error  measures  may  be  used  in  the  analysis  to 
follow. 

Before  proceeding  any  further,  we  will  establish  some  notational  conventions. 
Let  X be  an  ordered  ring;  then  X*  and  X**  will  respectively  denote  the  nonnegative 
and  positive  elements  of  X.  (This  will  be  used  in  the  cases  X • F,  the  real  numbers, 
and  X - Z , the  integers.)  The  symbol  means  "is  defined  to  be,"  while  "a"  means 
"is  identically  equal  to."  If  xi»  X2:  R ■*  R and  w:  R2  -»  It  are  differentiable,  then  for 
I - 1,  2,  we  will  write 

dj  *txi<t>.x2(t» 

for  the  result  of  differentiating  u(xiiX2)  respect  to  Xj»  and  then  substituting 
XI  - xi<t).  x2  ■ X2W-  We  use  the  notations  "x  4 a"  and  "x  t a"  to  indicate  one-sided 
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limits  as  in  Buck  [65].  Finaly,  we  shall  write  "(a.b)c"  to  fnidicate  the  part  of 
equation  (a.b),  as  in  Gurtin  [75]. 

Now  we  are  prepared  to  define  our  problem.  Let  3)  and  ^ be  as  above; 
consider  a problem  (xq,v)  in  5)*'$.  Let  * be  a class  of  one-step  methods,  and  let 
9:  *xl  -»  |R+  satisfying  lim  e(*,h)  - 0 be  a given  function  that  will  serve  as  an 
error  measure.  Choose  an  error  criterion  « satisfying  the  technical  restriction  0 < • < 
1.  We  then  wish  to  answer  two  questions: 

(1.)  Given  ? « *,  how  may  we  pick  h < I such  that 
(2.3)  *(e,h)  £ i, 

and  what  is  the  complexity  of  the  process  defined  by  p and  h? 

(2.)  How  may  one  choose  among  all  (p,h)  ( **I  such  that  (2.3)  holds,  that  pair 
(e*,h*)  giving  minimal  complexity? 

In  order  to  get  useful  bounds  on  e(^,h),  it  is  necessary  to  introduce  the  concept 
of  order.  In  this  section,  we  will  use  a highly  restricted  definition,  which  we  will  relax 
in  Section  3.  Let  ♦ - {^p:  p ( I**},  and  suppose  that  there  is  an  analytic  function 
n:  IR  + -»  R + such  that  lim  p_>o  *(p)^p  exists  and  is  nonzero  and 

(2.4)  v(? p,h)  - «(p)  hp  for  h ( I and  p ( 1 ++  . 

Then  *p  is  said  to  have  strong  order  p with  respect  to  9,  and  # is  said  to  be  a strong 
basic  sequence.  (Although  the  error  coefficient  9 will  generally  depend  on  the  solution 
x of  (2.1),  we  do  not  explicitly  indicate  this  dependence.)  Note  that  the  order  of  a 
method  depends  on  the  error  measure;  for  example,  the  order  with  respect  to  the  local 
error  per  step  is  one  greater  than  that  with  respect  to  the  local  error  per  unit  step  or 
the  global  error. 

Equation  (2.4)  is  somewhat  more  restrictive  than  that  which  is  usually 
encountered  in  practice;  more  often,  we  expect  x to  depend  on  h.  We  consider  the 


extension  of  our  results  to  this  case  in  the  next  section. 
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We  now  are  able  to  measure  the  complexity  o*  computing  an  approximate 
solution  to  (2.1),  with  error  not  exceeding  «,  using  a strong  basic  sequence  *.  Indeed, 
(2.4)  implies  that  a necessary  and  sufficient  condition  for  #(*p,h)  - i is  that 

(2.5)  h - h(p,«)  «(p)_1/P  e"*A> , 

where 

(2.6)  a In  (a-1)  . 

(Note  that  since  0 < « < 1,  we  have  a < R4"f.)  Thus,  the  number  of  steps  needed  is 
given  by 

(2.7)  n h-1  - xtp)1^  e“/P  . 

(Note  that  n (as  given  by  (2.7))  need  not  be  an  integer.  But  this  poses  no  essential 
difficulty;  see  (e.g.)  Traub  and  Wofniakowski  [76].)  Next,  suppose  that  there  exists  an 
analytic  function  c:  R + -»  R+  such  that  c(p)  is  the  cost  per  step  associated  with  the 
method  <pp.  Finally,  we  assume  that  the  cost  per  step  does  not  vary  from  step  to  step; 
for  the  classes  of  methods  we  consider,  this  means  only  that  we  assume  that  the  cost 
of  evaluating  v (or  its  derivatives)  does  not  depend  on  the  point  of  evaluation.  Thus 
the  complexity  C(p,a)  of  solving  (2.1)  to  within  an  error  criterion  « ■ e~a  is  simply 
given  by 

(2.8)  C(p,«)  - n c(p)  - f(p)e“/P, 
where  we  define  f:  R+  -»  R+  by 

(2.9)  f(p)  irtp^/Pdp)  . 

We  now  turn  to  the  question  of  picking  for  each  a ( R++  that  order  p giving 
minimal  complexity.  In  the  analysis  to  follow,  we  will  drop  the  restriction  that  p must 
be  an  integer.  However,  we  will  recover  optimality  over  the  integers  from  optimality 
over  the  real  numbers  in  Corollary  2.1  . Without  loss  of  generality,  we  assume  that 

(2.10)  p>0  implies  f(p)  > 0 . 
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(If  there  were  a p > 0 with  f(p)  - 0,  use  of  the  method  would  yield  a solution  with 
zero  complexity,  i.e.,  "with  no  effort.")  In  addition,  we  assume  that 

(2.11)  •imptoo  f(P)  ■ +0°  • 

By  (2.9),  this  assumption  maybe  viewed  as  a simple  consequence  of  two  conditions, 
both  of  which  are  quite  natural.  The  first  is  that  limpid  c(p)  - ♦a>;  the  "better"  a 
method  is  (i.e.,  the  higher  its  order  is),  the  more  we  should  expect  to  pay  for  its  use. 
The  second  condition  is  that  if  lim  pf^,  «(p)  ■ 0 , then  there  must  exist  af(I  such  that 
*(p)  2.  0 p for  p sufficiently  large.  (For  example,  in  the  class  of  Taylor  series  methods, 
using  the  worst-case  local  error  per  unit  step  as  the  error  measure,  this  second 
condition  would  follow  from  the  assumption  that  any  problem  (xq,v)  f 2>x*$  must  have 
an  analytic  solution.) 

Thus  in  order  to  find  a minimum  for  C(  • ,«),  we  merely  differentiate  (2.8)  with 
respect  to  p,  finding 

(2.12)  dj  C(p,«)  - p-2  f(p)  e«/P  [G(p>  - «] , 
where  G:  R ++  -»  IR  is  given  by 
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(2.13)  G(p)  p2  f'(p)/f<P>  . 

Thus  a necessary  condition  that  p be  a minimum  for  C(  • ,a)  is  that  dj  C(p,a)  • 0,  i.e., 

(2.14)  G(p)  • a . 

Sufficient  conditions  for  the  existence  and  uniqueness  of  a p satisfying  (2.14)  and 
minimizing  C(  • ,«)  are  given  in 
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r.~-r,-^r 


Theorem  2.1:  Let  f satisfy  (2.10)  and  (2.11).  Suppose  also  that 


(2.15) 


G'(p)  > 0 whenever  G(p)  > 0. 


Then  there  is  a function  p*:  IR++  -*  R++  such  that  (2.14)  holds  if  and  only  if  p - p*(«). 


Moreover,  for  all  p ( IR  **, 
(2.16) 


C*(a)  C(p*(e),«)  S C(p,«), 


with  equality  holding  if  and  only  if  p - p*(«). 

(Since  p*(«)  satisfies  (2.16),  we  call  p*(«)  the  optimal  order.  C*(«)  the  optimal 


<itv.  and 


(2.17) 


h*(a)  h(p*(«),e) 


the  optimal  step-size.) 


Proof  of  Theorem  2.1:  If  we  write  the  Maclaurin  series  of  f and  substitute  it 


into  (2.13),  it  is  easy  to  see  that 
(2.18) 


,impiO  g<p)  ■ °- 


We  now  claim  that 


(2.19)  l'mptoo  G(p)  - +oo. 

Indeed,  since  (2.11)  holds  there  is  a pq  > 0 such  that  f'(pg)  > 0,  i.e.,  G(pq)  > 0.  Thus 
by  (2.15),  G is  monotone  increasing  on  [pg,  +oo),  and  hence  either  (2.19)  holds  or  there 
exists  a y > 0 such  that  liMpfoo  G(p)  * T W the  latter  holds,  then  G is  bounded,  and 


we  have 


f'(t)/f(t)  S «t‘2  (1  S t < +oo) 


for  some  l > 0}  integrating  the  above  inequality  over  1 S t i p yields 

f(p)  S f(l)e,(1  ■ Vp), 

so  that  limpid  f(p)  s f(l)  e*,  contradicting  (2.11).  Thus  (2.18)  and  (2.19)  hold; 
together,  they  imply  that  for  any  a > 0,  there  is  a choice  of  p such  that  (2.14)  holds. 
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Suppose  that  for  some  o > 0,  there  were  two  numbers  pQ  < pj  with  G(pq)  - 
G(pj>  - a.  Then  by  Rolle’s  Theorem,  there  is  a p£  between  pq  and  pj  with  G/(p2>  • 0, 
contradicting  (2.15).  Thus  for  each  a > 0,  there  is  a unique  choice  of  p such  that  (2.14) 
holds;  we  denote  this  choice  by  p*(a). 

To  prove  (2.16),  differentiate  (2.12)  with  respect  to  p to  find 
(2.20)  d^ap,*)  - p"2  f(p)  e“/P  G'(p)  + [G(p)  - «]  (d/dp)  [p-2  f(p)  e“/P]  . 

But  upon  substituting  p «=  p*(o),  the  second  term  in  (2.20)  vanishes  and  the  first  term 
is  positive;  so  we  have 

C(p*(a),a)  > 0 . 

Thus  p*(a)  gives  a local  minimum  for  C(  • ,a),  which  has  only  one  critical  point  (since 
(2.14)  has  a unique  solution)  and  (2.16)  follows.  | 

Note  that  we  have  not  said  that  p*(a)  is  an  integer;  in  fact,  this  need  not  be  true 
in  general.  Since  the  basic  sequence  ♦ is  indexed  by  Z+\  we  have  not  yet  solved  the 
problem  of  choosing  from  among  all  (y>p,h)  such  that  (2.3)  holds,  that  pair  yielding 
minimal  complexity.  This  problem  is  solved  by 

Corollary  2.1;  For  any  or  > 0,  define  p**(a)  < Z ++  to  be  that  element  of  the  set 
{LP*<«)J  » rP*(«)l}  which  gives  the  smaller  value  of  C(  1 ,at).  Then 

C(p**(a),or)  S C(p,a)  for  p < Z ++  , 
with  equality  if  and  only  if  p = p**(«). 

Proof:  Clearly  we  need  only  consider  the  case  where  p*(o)  is  not  an  integer. 
Suppose  there  exists  pq  < Z ++,  not  equal  to  p**(«),  with  C(pQ,a)  < C(p**(o),«).  Without 
loss  of  generality,  assume  pq  < Lp*(«)J.  Then  C(p0,«)  £ C([p*(a)J,«)  i C(p*(o),o),  which 
implies  that  there  is  a pj  i (p0  , p*(a))  such  that  dj  C(p1(a)  - 0.  Hence,  G(pp  - a,  but 
Pl  ? p*(a).  This  contradicts  Theorem  2.1.  | • 
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It  may  be  readily  verified  that  the  hypotheses  of  Theorem  2.1  are  satisfied  for 
many  classes  of  functions  f.  Some  of  these  are 
logarithmic:  f<p)  - In  (p  + e) , 
monomial:  f(p)  - pm  (m(R++), 
exponential:  f(p)  - flP  (fi  > 1) , 
super-exponential:  f(p)  - pP  , and 
hyper -exponential:  f(p)  - 0PP  . 

(We  write  "In  (p  + e>",  where  e is  the  base  of  the  natural  logarithms,  rather  than  "In  p" 
as  a technical  convenience.  However,  an  expression  of  the  form  "In  (p  ♦ *)"  with  y > 0 
is  necessary  to  guarantee  that  f(l)  > 0.)  Furthermore,  we  find  that  if  f has  the 
monomial-logarithmic  form 

f(p)  - pa(ln(p+e))b  (a,b«R*+), 

then  the  hypotheses  of  Theorem  2.1  hold.  This  may  be  verified  either  directly,  or  by 
using  the  following  Lemma,  along  with  the  fact  that  the  hypotheses  hold  for  f(p)  - p 
and  f(p)  - In  (p  + e). 

Lemma  2.1:  Let  f have  the  form 

f(p)  - a n!^  (fj(p))ri  , 

where  a c IR++',  and  for  each  i (1  s i < m),  fj  satisfies  the  hypotheses  of  Theorem  2.1 
and  Tj  < R **.  Then  f satisfies  the  hypotheses  of  Theorem  2.1. 

Proof:  It  is  clear  that  if  each  fj  satisfies  (2.10)  and  (2.11),  then  so  does  f.  If 
each  fj  yields  (via  (2.13))  a Gj  satisfying  (2.15),  then  f yields  a G in  the  form 

«P>  - Xjj  rjGjtp) , 

and  so  it  is  clear  that  G satisfies  (2.15).  | 

For  the  important  methods  of  practical  interest,  we  will  only  be  interested  in 
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monomial  and  monomial-logarithmic  growth;  see  Werschulz  [76a],  [76b].  We  include  the 
other  examples  of  functions  that  satisfy  the  hypotheses  of  Theorem  2.1  to  illustrate 
the  wide  variety  of  functions  that  qualify. 

So  we  have  seen  that  under  the  hypotheses  of  Theorem  2.1,  there  is  a unique 
choice  of  order  and  step  size  minimizing  the  total  complexity  for  any  error  criterion. 
What  happens  to  these  choices  as  a changes? 

Theorem  2.2:  Let  f satisfy  the  hypotheses  of  Theorem  2.1.  Then 
(1.)  p*(a)  and  C*(«)  increase  monotonically  with  a. 

(2.)  lim^oo  p*(a)  - limatoo  C*(«)  - +oo. 

(3.)  If  there  exists  M > 0 such  that  *(p)^p  i M for  all  p,  then 
lim  infaf00  h*(a)  > 0 if  a/p*(a)  is  bounded  as  etoo. 

Proof:  To  prove  (1.),  note  that  p*  is  the  functional  inverse  of  GL  Thus  p*'(«)  ■ 

G'(p*(«))-*  > 0,  so  that  p*(«)  increases  with  a.  Now  use  the  chain  rule: 

C*'(«)  - C(p *(«),«)  p*'(«)  + d2  C(p *(<*), a). 

But  the  first  term  on  the  right-hand  side  vanishes  by  the  definition  of  p*(«).  So 

C*'<«)  - d2  C(p*(«),«)  - (p*^))-1  f(p*(«»  ea/P*(B>  > 0 

and  C*(«)  increases  with  a. 

Suppose  that  lim  p*(a)  / +oo  . Since  p*(«)  increases  monotonically  with  «, 
there  is  an  L > 0 such  that  lim^^  p*(e)  - L.  So  (2.14)  implies  that 
G(L)  - I'mgfoo  G(p*(«))  - - +«• , 

contradicting  the  continuity  of  G.  This  proves  the  first  part  of  (2.)  . Now  for  any 
a > 0,  we  have 

C*(«)  - f(p*(«))  e*/P*(a)  > f(pV). 

Let  at  Co-,  then  (2.1 1)  and  the  first  part  of  (2.)  imply  that  the  second  part  of  (2.)  holds. 

To  prove  (3.),  let  such  an  M > 0 exist,  so  that  [h*(«)]“^  S M e*/p  ^m\  Then  we 
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see  that  lim  inf^^  h*(«)  > 0 if  [h*(«)]"^  is  bounded  as  «too,  which  itself  is  true  If 
a/p*<«)  is  bounded  as  ortoo.  | 

Therefore  under  a very  general  set  of  conditions,  we  see  that  the  more 
accuracy  we  want  in  our  computed  solution,  the  greater  its  complexity  becomes.  Of 
course,  this  is  just  what  we  would  expect.  What  is  somewhat  surprising  is  that  the 
minimal  complexity  is  obtained  by  letting  the  order  p increase  as  the  error  • 
decreases,  with  p increasing  without  bound  as  t tends  to  zero.  Moreover,  the  last  part 
of  the  theorem  says  that  not  only  should  the  order  be  increased  when  trying  to  obtain 


a more  accurate  solution,  but  that  it  may  actually  turn  out  that  the  step-size  should 
not  be  allowed  to  tend  to  zero.  In  addition,  it  is  clear  that  the  proof  of  the  result 
concerning  the  limiting  behavior  is  valid,  provided  that  we  only  assume  that  f is 
continuously  differentiable  on  the  positive  real  axis  and  that  p*(«)  tends  to  a (possibly 
infinite)  limiting  value  as  s t oo. 

We  now  determine  whether  we  are  saving  a great  deal  by  using  the  optimal- 
order  method.  This  may  be  thought  of  in  several  ways;  we  will  consider  how  sharply 
the  cost  curve  turns  at  the  optimum,  the  cost-difference  between  using  a method  of 
fixed  order  and  a method  of  optimal  order,  and  the  cost-ratio  of  a fixed-order  method 
to  an  optimal-order  method.  We  will  show  that  under  certain  reasonable  conditions,  all 
of  these  measures  tend  to  infinity  with  a. 

How  sharply  the  cost  curve  turns  at  the  maximum  is  measured  by  C(p*(«),«r). 
If  we  consider  five  of  the  growth  models  mentioned  above  (e.g.,  monomial,  monomial- 
logarithmic,  exponential,  hyper-exponential,  and  super-exponential),  we  find  that 
C(p*(e),a)  is  monotone  increasing  for  a sufficiently  large,  and  tends  to  infinity  with 
a,  with  but  one  exception;  in  the  case  of  “linear  growth"  (f(p)  ■ p),  we  find  that 


14 


dj2  C(p*(«),«)  » e.  However,  in  the  classes  of  algorithms  we  study,  the  case  f(p)  - p 
does  not  arise,  provided  that  we  include  Hcombinatory  cost"  (defined  as  in  Kung  and 
Traub  [74])  in  our  complexity  measure.  Thus  in  general,  we  find  that  the  "pointedness" 
of  the  cost  curve  near  the  minimum  increases  without  bound  as  stoo . 

Next,  we  will  show  that  for  an^  f satisfying  the  hypotheses  of  Theorem  2.1,  the 
difference  in  complexity  between  using  a method  of  fixed  order  and  a method  of 
optimal  order  tends  to  infinity  with  a. 

Proposition  2.2:  For  any  fixed  pg  < such  that  G'(pg)  i 0, 
lim«tco  [C<Pg.«)  - C*<«>]  - . 

Proof:  Pick  a so  large  that  p*(«)  > pg,  and  let  pg  < p < p*(«).  If  we  write  out 
the  partial  derivative  in  the  last  term  of  (2.20),  we  find  that 

dj2  C(p,«)  - p"2  f(p)  e“/P  G'(p)  + p’4  [«  - G(p)]  [<«  «■  2p)  - G(p)]  f(p)  . 

Since  pg  < p*(«),  we  have  G(p)  < or;  it  then  follows  that  dj2C(p^r)  is  positive  and 
bounded  away  from  zero  as  a tends  to  infinity.  Since 

C(pg,«)  - C%r>  - dj2  C(p,«r>  [pg  - p*(a)f  / 2 
for  some  p between  pg  and  p*(o),  the  result  follows.  | 

As  for  the  cost-ratio,  a simple  calculation  shows  that 

,im«too  C<P0.«>/C*(«>  " +0° 

in  all  of  the  examples  given  above.  Thus  there  are  a number  of  ways  in  which  we 
incur  a large  additional  cost  by  not  using  the  optimal  order. 

One  may  wonder  whether  the  result  that  optimal  order  increases  and  tends  to 
infinity  with  a is  "reasonable."  One  way  of  determining  this  is  to  examine  actual 
numerical  tests)  we  cite  Hull  et  al.  [72]  as  a well-known  example.  Since  we  are  only 


dealing  with  methods  of  fixed  order,  our  theory  does  not  attempt  to  handle  methods 
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such  as  Bulirsch-Stoer,  Krogh,  or  Gear.  However,  let  us  look  at  the  results  of  Hull  et 
al.  for  the  Runge-Kutta  methods.  Even  though  there  are  only  three  methods  (of  orders 
four,  six,  and  eight)  and  three  error  criteria  («  ■ 10*®,  10"®,  and  10*®),  Table  1 in  Hull 
et  al.  [72]  indicates  that  the  optimal  order  does  increase  as  « decreases.  (We  give 
more  extensive  numerical  data  in  Section  5.) 

Finally,  we  note  that  the  restriction  that  the  grid  be  equidistant  may  be 
weakened  somewhat,  provided  that  we  use  a local  error  measure.  Indeed,  let  I be 
partitioned  as  I - Ij  u ...  u I|_,  and  now  assume  that  we  use  a grid  that  is  equidistant  on 
each  subinterval  Ij,  ...  , I|_.  Then  the  total  complexity  is  given  by  the  sum  of  the 
complexities  of  all  subintervals 

C(pj,...,Pj_,e)  •"  j Cj(pj,«), 

where  we  set 

Cj(p,«)  fj(p,«)e®/P  , fj(p):-«i(p)1/Pc(p)j 

here  *j(p)  is  the  error  constant  of  *p  on  Ij.  Since  we  use  a local  error  measure,  we 
find  that  C(pj,...,pL,a)  is  minimized  by  choosing  each  Pj  to  minimize  Cj(  • , a).  Thus  the 
earlier  results  apply;  in  particular,  if  we  define  p*(a)  to  be  the  optimal  order  on  I),  we 
find  that  if  fj  satisfies  (2.10),  (2.11),  and  (2.15),  then  Pj*(a)  increases  and  tends  to 
infinity  with  a. 
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3.  Optimality  Within  a Basic  Sequence 

There  ere  two  difficulties  with  the  approach  taken  in  Section  2.  The  first  has 
already  been  mentioned--we  generally  expect  the  error  coefficient  to  depend  on  the 
step-size.  The  second  is  based  on  the  fact  that  there  are  a large  number  of  p^-order 
methods  of  a given  type,  and  we  wish  to  use  the  best  method  possible.  In  theory,  this 
would  involve  finding  a p**1 -order  method  with  minimal  cost  per  step.  In  practice,  this 
is  not  often  possible;  there  is  a gap  between  the  minimal  cost  theoretically  possible 
and  the  cost  of  the  best  method  known.  So  we  now  consider  the  extension  of  the 
results  in  Section  2 to  a more  general  setting,  which  will  take  these  two  difficulties 
into  account. 

We  first  refine  our  notion  of  order.  Let  »:  ♦><!  -*  R+  be  an  error  measure, 
where  $ - [*p:  p < Z **}  is  a class  of  one-step  methods,  and  suppose  that  a function 
*:  IR  +xl  -»  I?  + and  analytic  functions  » *U  : R + such  p-*0  *L^^P 

and  lim  p_,Q  *jj(p>^/P  exist  and  are  nonzero  and 

(3.1)  »<*p,h)  - «(p,h)hP  for  h < I and  p < I * * , 

where 

(3.2)  0 < *|_(p)  * «(Pih)  £ «|j(p)  < +oo  for  h t I . 

Then  ^p  is  said  to  have  order  p with  respect  to  r,  and  * is  said  to  be  a basic 
sequence  (as  in  Traub  [64]);  *(p,h)  is  said  to  be  the  error  coefficient  of  *p.  (Here  we 
introduce  the  convention  of  attaching  the  subscripts  "L"  and  "U"  to  quantities  that 
refer  to  lower  and  upper  bounds  on  complexity,  respectively.) 

This  definition  of  order  is  similar  to  that  in  Cooper  [69]  and  Cooper  and  Verner 
[72],  except  that  we  include  a lower  bound  «L(p)  on  «(p,h);  this  lower  bound  is 
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necessary  and  sufficient  to  guarantee  that  the  order  of  a method  is  well-defined.  Note 
that  this  definition  makes  sense  for  all  values  of  h « I;  thus,  it  is  non-asvmptotic  in  that 
we  do  not  require  h i 0 in  order  for  it  to  make  sense.  Clearly,  a strong  basic 
sequence  is  a basic  sequence;  hence,  the  definition  of  order  is  an  extension  of  the 
definition  of  strong  order  given  in  Section  2.  Finally,  note  that  the  order  depends  on 
the  choice  of  the  error  measure  a;  for  instance,  the  order  with  respect  to  the  local 
error  per  step  exceeds  that  with  respect  to  the  local  error  per  unit  step  by  one. 

We  next  discuss  the  notion  of  cost  per  step.  As  pointed  out  above,  we  will 
generally  have  only  bounds  on  the  cost  c(p)  required  per  step  of  a given  p^-order 
method: 

(3.3)  cL(p)  £ c(p)  £ cu(p). 

That  is,  c^(p)  is  a lower  bound  on  the  minimum  possible  cost  per  step,  usually  derived 
via  theoretical  considerations,  and  Cjj(p)  is  an  upper  bound  on  the  minimum  possible 
cost  per  step,  which  is  derived  by  exhibiting  an  algorithm  for  computing  *p.  (In  what 
follows,  we  shall  assume  that  cl  , cy  : R + -*  R + are  analytic  functions.) 

We  now  wish  to  give  bounds  on  C(p,a),  the  complexity  of  finding  an  approximate 
solution  of  (2.1)  using  the  method  *p,  such  that  r(?p,h)  £ e"“.  Suppose  that  (2.3) 
holds.  Then  by  (3.1)  and  (3.2),  we  must  have 

(3.4)  «L(p)hP  S e_a,  i.e.,  h s hL(p,«)  «L(p )-l/Pe-«/P  . 

Hence,  the  number  of  steps  n - h'*  must  satisfy 

(3.5)  n & ^(p)1^  e«/P  . 

Defining  (a6  in  Section  2) 

(3.6)  C(p,a)  n c(p) 

(i.e.,  total  complexity  equals  number  of  steps  required  multiplied  by  cost  required  per 
step),  (3.3)  and  (3.5)  imply  that 
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(3.7)  C(p,«)  i CL(p,«)  fL(p)  •*/»>, 

where 

(3.8)  f |_(p>  «|_(p)1/,p  cl(p)- 

That  is,  regardless  of  the  algorithm  used  to  compute  *p,  the  total  complexity  of  finding 
jq  approximate  solution  Qf  (2.1)  must  exceed  Cl(p,«). 

On  the  other  hand,  we  find  that  in  order  to  use  *p  to  find  such  an  approximate 
solution,  it  suffices  (by  (3.1)  and  (3.2))  to  take 

(3.9)  «lKp)  hp  - e'*,  i.e.,  h - hy(p,«)  e~*!p  . 

so  that  we  need  only  take  n steps,  where 

(3.10)  n - xyfp)1^  e*/p  . 

(As  in  Section  2,  the  value  of  n given  by  (3.10)  need  not  be  an  integer;  again,  this  is 
handled  as  in  Traub  and  Wofniakowski  [76].)  Thus  (3.3),  (3.6),  and  (3.10)  imply  that 

(3.11)  C(p,«)  S Cy(p,«)  fy(p)e«/p, 

where 

(3.12)  fy(p)  «u<p)1/p  cy(p). 

That  is,  there  exists  an  algorithm  for  computing  pp  such  that  the  total  complexity  of 
finding  an  approximate  solution  of  (2.1)  equals  Cy(p,«).  We  summarize  the  above 
results  in 

Theorem  3.1:  Let  C(p,«)  be  the  complexity  of  finding  an  approximate  solution  of 
(2.1),  using  the  method  ^p,  with  »(^plh)  < e“*.  Then 

(3.13)  CL(p,«)  < C(p,«)  S Cy(p,o), 

where  Cy  and  Cy  are  given  by  (3.7)  and  (3.11).  Moreover,  if  h ■ h(p,«)  is  the  maximal 
step-size  for  the  method  *p  such  that  r(?p,h)  s e"w,  then 

(3.14)  hy(p,ar)  S h(p,«)  S hL(p^r) . | 

Next,  we  consider  the  problem  of  optimality.  Define  the  optimal  complexity  by 
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(3.15)  C*(e)  inf  {C(p,«):  *p  < *}  . 

We  are  interested  in  bounds  for  C*(a).  These  are  derived  in 

Lemma  3.1:  Let  fL  and  fy  satisfy  (2.10)  and  (2.11),  and  suppose  that  fy  and  fy 
respectively  yield  (via  (2.15))  Gl  and  Gy  satisfying  (2.15).  Then  Gl  and  Gy  have 
respective  inverse  functions  Pl*,  Py*:  R++  -»  R++  such  that  for  all  p < R4"*-, 

(3.16)  Cl*(«)  :■  C|_(pL*(a),«)  i CL(p,a) 
and 

(3.17)  Cy*(«)  :-  Cy(py*(«),a)  <,  Cy(p,«)  , 

with  equality  in  (3.16)  (respectively,  (3.17))  if  and  only  if  p • pl*(o)  (respectively,  p - 
Py*(«». 

Proof:  This  is  an  immediate  corollary  of  Theorem  2.1.  | 

We  call  Pl*(«)  (respectively,  py*(«))  the  lower  (upper)  optimal  order.  Cj_*(«) 
(respectively,  Cy*(«))  the  lower  (upper)  optimal  complexity,  and 

(3.18)  hL*(o)  :-  hL(PL*(«),tt)  (respectively,  hy*(e)  :-  hy(py*(«),a)) 

the  lower  (upper)  optimal  step-size.  Combining  (3.13),  (3.15),  and  Lemma  3.1,  we  have 
Theorem  3.2:  Let  ^ and  fy  be  as  in  Theorem  3.1.  Then 
CL*(o)  < C*(«)  S Cy*(«).  | 

Note  that  if  we  define  p*(e)  by 

C(p*(«),«)  - C*(a) , 

we  can  make  no  statement  relating  p*(o),  pL*(a),  and  py*(«).  This  is  because  we  only 
have  bounds  for  C(p,a);  we  do  not  know  C(p,«)  itself.  In  fact,  it  is  important  to  realize 
what  p |_*(«)  and  py*(«)  tell  us.  First,  consider  py*(a).  We  can  achieve  a complexity  of 
Cy*(«)  by  using  a step-size  of  hy*(«),  along  with  the  method  of  order  Py*(e).  This  will 
give  optimal  complexity  within  the  sequence  of  algorithms  for  computing  9,  with  cost 
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per  step  of  *>p  given  by  Cy(p).  Next,  consider  pL*(«).  It  is  of  perhaps  theoretical 
rather  than  computational  interest,  in  that  we  cannot  compute  with  it.  What  does 
interest  us  is  C|_*(a),  since  it  limits  the  theoretical  improvement  in  Cy*(«).  Thus,  we 
are  interested  in  P|_*(«)  solely  as  a means  of  computing  Cl*(«). 

We  now  consider  behavior  of  these  quantities  as  a increases  and  tends  to 
infinity. 

Theorem  3.3:  Let  fj_  and  fy  be  as  in  Theorem  3.1.  Then 

(1.)  p |_*(«),  py*(«),  and  Cy*(a)  increase  monotonically  and  tend  to 

infinity  with  or. 

(2.)  If  there  exists  an  My  > 0 such  that  *y<p)1/l:>  £ My  for  all  p,  then 
lim  infgjQQ  hy*(e)  > 0 if  e/py*(«)  is  bounded  as  aToo. 

(3.)  If  there  exists  an  M^  > 0 such  that  Z M^  for  all  p,  then 

lim  inf^Qj  h^a)  > 0 only  if  «/pj_*(a)  is  bounded  as  «Too. 

Proof:  To  prove  (1.),  it  suffices  to  apply  (1.)  and  (2.)  of  Theorem  2.2  to  p^*  and 
Cy*,  and  to  py*  and  Cy*.  The  proof  of  (2.)  and  (3.)  is  similar  to  the  proof  of  (3.)  in 
Theorem  2.2.  | 

Note  that  (1.)  in  Theorem  3.3  does  not  state  how  p*(«)  varies  with  cr,  as  we  have 
pointed  out  above,  no  statement  about  p*(a)  may  be  obtained  from  the  information 
available.  However,  it  is  easy  to  see  that  C*(«)  increases  monotonically  with  a and  that 
''m«Tooc*(a>  " +°°  • 

Thus,  we  have  extended  the  optimality  theory  of  Section  2 to  a more  realistic 
situation.  In  Werschulz  [76a],  [76b],  the  techniques  of  this  section  are  applied  to  some 
important  basic  sequences  of  one-step  methods:  we  will  see  that  the  conclusions  of 
Lemma  3.1  and  Theorems  3.2  and  3.3  hold  for  these  basic  sequences. 
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4.  Normality  and  Order-Convergence 

Let  ♦ be  a basic  sequence  with  respect  to  the  error  measure  a;  we  say  that  • is 
order-convergent  if  there  exists  an  hQ  > 0 such  that 

(4.1)  ,imptoo  *u(p)  hP  " 0 for  h i hQ  . 

Clearly,  the  order  convergence  of  ♦ implies  that  limp-j^  e(*p,h)  - 0 for  h S hQ.  We 
use  the  term  "order-convergence"  rather  than  "convergence,"  since  the  latter  term 
appears  extensively  in  the  literature  (e.g.,  Henrici  [62])  and  is  always  used  to  mean  a 
"step-size  convergence,"  i.e.,  lim^Q  e(^,h)  »*  0 for  a fixed  method 

It  is  intuitively  plausible  that  as  the  order  of  an  approximation  increases,  the 
approximation  should  improve,  especially  when  one  is  trying  to  approximate  a very 
smooth  function.  Unfortunately,  Gear  [71]  points  out  that  an  increase  in  order  need 
not  always  decrease  the  error.  This  situation  appears  in  other  situations  in  numerical 
mathematics}  for  instance,  the  family  of  Newton-Cotes  quadrature  formulae  is  riot 
order -convergent.  But  suppose  there  exists  a step-size  hp  > 0 for  which  the  upper- 
bound  error  is  exponentially  bounded  for  p sufficiently  large;  that  is,  there  exists 
A > 0 and  pq  ( Z ++  such  that 

(4.2)  «y(p)  h0P  - AP  ,or  P > PO  • 

If  we  define 

My  max  { max1SpSpo  (xyfp)1^}  , Ah0-1}  , 

we  then  have 

(4.3)  <r(*p,h)  S (Myh)P  for  h s h0  , p < Z ++  . 

Note  that  the  bound  in  (4.3)  is  similar  to  that  given  by  Cauchy’s  Integral  Theorem 
(Ahlfors  [66],  pg.  122)  on  the  normalized  derivatives  of  an  analytic  function.  In  fact, 
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for  several  classes  of  methods,  the  bound  (4.3)  holds  whenever  the  solution  of  (2.1)  it 
analytic. 

We  also  formalize  a weakened  version  of  (4.3),  which  will  be  important  in  our 
study  of  one-step  methods.  Let  * be  a basic  sequence,  and  suppose  that  for  each 
(xq,v)  < 5)xT?,  there  is  a sequence  {hp:  p < Z ++}  c I and  a positive  constant  My  such 
that 

(4.4)  *(*p,h)  S (Myh)P  if  h < hp  ; 

then  $ is  said  to  be  normal.  Note  that  (4.3)  implies  (4.4),  while  (4.4)  implies  (4.3)  only 
when  the  sequence  {hp}  has  non-vanishing  support: 

(4.5)  h#  lim  infp^  hp  > 0 . 

If  h^  - 0,  normality  gives  an  exponential  upper  bound  on  the  sequence  of  principal 
error  functions  (Section  3.3-5  of  Henrici  [62]),  which  are  an  asymptotic  measure  of  the 
error  as  h 1 0. 

There  is  a simple  relation  between  normality  and  order-convergence. 

Proposition  4.1:  ♦ is  order -convergent  if  and  only  if  ♦ is  normal  with 
nonvanishing  support. 

Proof:  If  (4.1)  holds,  then  (in  particular)  we  have  limp-j^  «y(p)  hgp  - 0,  so  that 
*y(p)  hgP  < 1 for  p sufficiently  large;  i.e.,  (4.2)  holds  with  A - 1.  Then  (as  in  the 
discussion  above)  (4.3)  holds,  implying  normality  with  finite  support. 

Conversely,  if  (4.4)  holds  with  finite  support,  we  pick  a positive  hg  which  is  less 

than 

n min  {My'1,  inf  {ty  p ( I**}  } . 

(Note  that  n > 0 by  (4.5).)  Let  h s hg  be  given,  so  that  for  some  I with  0 < I < 1,  we 
have  h - (1  - t hi  if  we  define  «y  by 


«l/p)  MyP  * 
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we  find  (since  h < hp)  that 

<r(*p,h)  s (Myh)P  . 

Thus 

«y(p)  hP  - (Myh)P  - (My  (1  - «) i,)P  S (1  - «)P 
(the  last  step  since  «>  s My-1),  so  that  (4.1)  holds.  | 

We  are  now  interested  in  normality  and  order-convergence  for  a specific  error 
measure  e;  we  will  be  interested  in  > *|_ ' anc*  *G  » w6ich  are  (respectively)  defined 
to  be  the  maximum  local  error  per  unit  step,  local  error  per  step,  and  global  error  per 
step  over  the  grid.  It  is  easy  to  see  that  a normal  (order-convergent)  sequence  ♦ - 
P « 2 ++}  with  respect  to  ff|_  naturally  yields  a normal  (order-convergent) 
sequence  ♦ *•  {^p:  p(  2 ++}  with  respect  to  ^y  by  setting  ^p  :«  ^p+^  for  p < Z ++. 
We  now  look  at  the  relationships  between  e^y  and  eg. 

Proposition  4.2;  Let  v have  Lipschitz  constant  K on  IR^,  and  let  ♦ be  normal 
(respectively,  order -convergent)  with  respect  to  ffLU>  in  (4.4)  independent  of 

xq  < domain(v).  Then  ♦ is  normal  (respectively,  order -convergent)  with  respect  to  »q. 

Proof;  Let  p be  the  exact  relative  increment  function  of  (2.1)  (as  defined  in 
Henrici  [62]),  so  that 

x(tj+1)  * x(tj)  + h p(x(tj),  h ) . 

Subtract  (2.2)  (with  <p  replaced  by  ^p)  from  the  above  to  get 

ei+l  ” ei  + h [pM*j).h)  " ^p<Xj,h)3 , 
where  ej  :■  x(tj)  - Xj  for  0 < i £ n.  Thus 


Ikj+lll  ^ Ikjll  + h ||p(x(tj),h)  - p(xj,h)||  + h ||p(xj,h)  - *>p(Xj,h)|| 

S (1  + hK)  ||o,||  + MyP  hP+1  if  h s hp  ; 

this  last  step  follows  from  the  Lipschitz  condition  and  the  "uniform"  normality  with 
respect  to  »lu-  By  Lemma  1.2  of  Henrici  [62]  and  the  condition  6q  - 0,  we  have 
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||»j||  S K"1  [<1  +hK)'  - ^(MuhjP 
S K-1  [<1  + hK)n  - l](Mijh)P 
S K'1  (eK  - 1)  (Muh)P 

for  all  is  this  gives 

*G(*plh)  S K"1  (eK  - 1)  (Mjjh)P  S (Mh)P  if  h s hp, 
for  a suitably-defined  M > 0.  This  proves  the  normality  part;  the  remainder  of  the 
result  follows  from  Proposition  4.1.  | 

If  it  is  undesirable  to  use  the  "uniform  normality"  (i.e.,  the  condition  that  My  be 
independent  of  xq  < domain(v)  in  (4.4)),  we  may  use  the  following  result. 

Proposition  4.3:  Let  v be  Lipschitz  continuous,  let  ♦ be  normal  (respectively, 
order-convergent)  with  respect  to  and  suppose  that  there  exists  a X > 0 such  that 
for  all  tPp  < # and  all  x,  y < IR^, 

||^p(x)  - tpp(y)||  S X p ||x  - y||  . 

Then  # is  normal  (respectively,  order -convergent)  with  respect  to  eg. 

Proof:  Immediate  from  Theorem  3.3  of  Henrici  [62] . | 

Thus  normality  for  follows  from  normality  for  , a Lipschitz  condition  on  v 
and  the  elements  of  ♦,  and  a linear  upper  bound  on  the  Lipschitz  constants  for  the 
elements  of  4. 

We  now  , discuss  the  problem  of  finding  uniform  lower  bounds  on  the  error  which 
are  similar  to  the  uniform  upper  bounds  which  normality  provides.  This  will  amount  to 
a restriction  of  the  admissible  problem  class  S)x7!)  so  as  to  guarantee  that  the 
problems  are  "sufficiently  difficult."  However,  this  restriction  may  be  abandoned  if  we 
are  interested  only  in  upper  bounds.  We  shall  assume  throughout  thg.  rest  of  this 
section  that  there  is  an  Ml  > 0 (which  will  generally  depend  on  ♦,  t,  and  the  problem 
(xq,v))  such  that 
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(4.6)  *(*p,h)  z (k^h)?  for  h « l . 

Note  that  (4.6)  will  hold  for  any  situation  in  which  there  is  no  order -convergence,  or  in 
which  the  order-convergence  (if  any)  is  no  faster  than  an  exponential  decay; 
moreover,  in  the  methods  we  consider  in  Sections  5 and  6,  (4.6)  is  a consequence  of 
the  assumption  that  all  derivatives  assume  the  (sharp)  worst -case  upper  bound 
provided  by  Cauchy's  estimate.  It  is  clear  that  if  (4.6)  holds  for  , it  holds  for  »yj  ; 
if  (4.6)  holds  for  anc*  ^e  gradient  matrix  Vp  has  only  non-negative  entries  (with 
at  least  one  positive  entry),  then  (4.6)  holds  for  *q  . 

It  is  possible  to  present  a simplified  version  of  the  expressions  derived  in 
Section  3,  under  the  assumption  that  ♦ is  order-convergent.  We  first  look  at  the 
complexity  of  a single -method  within  an  order-convergent  basic  sequence. 

Theorem  4.1:  Let  ♦ be  order-convergent  with  respect  to  #.  Then 
CL(p,«)  i C(p,a)  S C|j(p,a) , 

where 

CL(p,a)  :«  Ml  cL(p)  e*/P  and  Cyfp,*)  :•  k^  c^p)  e°/p  . 


Proof:  This  is  an  immediate  corollary  of  Theorem  3.1  and  the  definition  of 


order -convergence.  | 

We  may  now  do  the  optimality  theory  of  Section  3,  finding  that 
(4.7)  G|_(p)  - p2  ^'(pl/c^p}  and  Gy(p>  - P2  Cu^pl/c^p)  . 

Note  that  the  assumptions  (2.10)  and  (2.11)  now  state  that  C[_(p)  and  Cy(p)  must  be 
positive  for  p > 0 and  tend  to  infinity  with  p,  which  is  a natural  way  to  expect  the  cost 
per  step  to  behave.  The  results  stated  in  Theorems  3.2  and  3.3  hold  as  before. 
Moreover,  it  should  be  noted  that  the  and  M|_  needed  in  (2.)  and  (3.)  in  the 
statement  of  Theorem  3.3  are  precisely  the  Mjj  and  Mj_  in  (4.4)  and  (4.6).  Thus 
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lim  inf^gj  hy*(«)  > 0 if  «/py*(«)  is  bounded  as  « t oo,  and  «r/pj_*(«)  is  bounded  as 
« t oo  if  lim  infaf  m h|_*(«)  > 0. 

Thus,  the  order-convergence  of  a basic  sequence  is  useful  in  simplifying  the 
analysis  of  its  complexity.  Of  the  three  basic  sequences  studied  in 
Werschulz  [76a],  [76b],  two  are  known  to  be  order -convergent.  The  proof  of  the 
order -convergence  of  the  class  of  Taylor  series  methods  is  a simple  consequence  of 
the  Cauchy  estimate;  that  of  the  order-convergence  of  the  (non-optimally  ordered) 
nonlinear  Brent-Runge-Kutta  methods  (based  on  the  iterative  methods  defined  in 
pp.  4-7  of  Brent  [74]}  involves  using  some  classical  results  on  orthogonal  polynomials 
to  sharpen  the  proofs  in  Brent  [74]  . We  note  that  it  is  not  known  whether  the 
optimally-ordered  nonlinear  Brent-Runge-Kutta  methods  (based  on  the  iterative 
methods  defined  in  pp.  10-13  of  Brent  [74])  are  order -convergent;  it  does  appear 
likely  that  they  are  normal  with  vanishing  support.  However,  we  do  not  pursue  this 
class  of  methods  in  Werschulz  [76b],  because  of  their  high  combinatory  cost. 

It  is  not  known  whether  the  linear  Runge-Kutta  methods  found  in  Coope'  [65] 
and  in  Cooper  and  Verner  [72]  are  order-convergent;  the  best  result  known  is  the 
(My  log(p+e))p  result  given  in  Werschulz  [76a],  which  involves  strengthening  the 
original  proof  with  other  estimates  from  the  theory  of  orthogonal  polynomials.  But  it 
should  be  pointed  out  that  there  does  exist  a class  of  order-convergent  linear  Runge- 
Kutta  methods;  this  is  the  sequence  given  by  using  the  weights  and  abscissae  for 
Gauss  quadrature  in  the  methods  defined  on  page  144  of  Stetter  [73].  The  problem 
with  this  class  of  methods  is  that  each  step  of  requires  2"P(p+l)!  function 
evaluations;  the  prohibitive  cost  per  step  outweighs  by  far  any  advantage  to  be  gained 
from  the  order-convergence.  Thus,  the  question  of  whether  there  exist  any  order 
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5.  Numerical  Results 


In  the  previous  sections,  we  showed  that  the  optimal  order  increases  as  the 
error  criterion  s decreases,  tending  to  infinity  as  t tends  to  zero.  Here,  we  consider 
actual  numerical  results  of  optimal  order  and  minimal  cost  for  various  test  problems 
and  classes  of  methods;  these  results  show  that  the  optimal  order  does  indeed  exhibit 
the  behavior  indicated.  (The  optimal  order  for  a given  error  criterion  was  determined 
by  finding,  for  each  method  implemented,  the  coarsest  mesh  that  allowed  the  error 
criterion  to  be  satisfied;  the  resulting  complexities  were  then  compared  to  determine 
the  optimal  order.)  The  error  measure  used  was  the  "endpoint  error,"  i.e.,  the  co-norm 
(see  e.g.,  Stewart  [73],  pg.  164)  of  the  difference  between  the  true  and  computed 
solutions,  evaluated  at  the  endpoint  of  the  interval  of  interest  (the  unit  interval  I).  All 
testing  was  carried  out  on  the  Carnegie-Mellon  University  Computer  Science 
Department’s  PDP-10  in  ALGOL  and  FORTRAN,  using  double  precision. 

The  first  problems  considered  were  of  the  form 

(5.1)  x(t)  - X x(t)  x(0)  - 1 

on  the  unit  interval  I.  Although  this  problem  is  easy  to  handle  analytically,  any  general 
problem  of  the  form  (2.1)  may  be  locally  approximated  by  a linear  system  of  ordinary 
differential  equations  (see  e.g.,  Hindmarsh  [74],  pp.  17-18).  If  the  coefficient  matrix  of 
this  linear  system  is  diagonalizable,  an  uncoupled  set  of  scalar  equations  of  the  form 
(5.1)  will  result. 


These  problems  were  solved  via  Taylor  series  methods;  the  optimal  order  is 
given  in  Table  5.1  for  the  choices  of  X indicated.  Here  the  optimal  order  was  taken  to 
be  that  order  which  minimizes  the  number  of  evaluations  of  the  right-hand  side  of  (5.1) 
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required  to  attain  the  desired  error  criterion.  As  expected,  the  number  of  evaluations 
required  increases  as  the  error  criterion  « decreases.  Moreover,  the  optimal  order 
also  increases  monotonically  as  « decreases,  just  as  the  theory  predicts. 

We  next  turn  to  the  solution  of  the  test  problem 

(5.2)  *<t)  - cos2x(t)  x(0)  - 0 . 

For  this  problem,  we  searched  for  the  optimum  "unmodified"  Brent-Runge-Kutta 
method.  For  this  problem,  the  optimal  order  was  taken  to  be  that  for  which  the  actual 
CPU  time  (in  milliseconds)  required  to  solve  the  problem  to  within  a given  • was 
minimized.  Since  there  is  a certain  amount  of  randomness  in  such  a measure,  the  mean 
time  for  ten  runs  was  analyzed.  Not  surprisingly,  it  turned  out  that  the  order  which 
minimized  the  CPU  time  also  minimized  the  number  of  evaluations  of  the  right-hand 
side  of  (5.2).  Since  the  (n  + 2),h-order  method  requires  the  zeros  of  the  Jacobi 
polynomial  Gn(2,  2,  • ),  and  the  best  set  of  values  available  only  contained  the  zeros 
for  1 s n S 8 (Table  25.8  of  Abramowitz  and  Stegun  [64]),  only  the  methods  of  order 
not  exceeding  ten  were  implemented. 

The  results  for  problem  (5.2)  are  given  in  Table  5.2.  Here,  the  optimal  order  p*, 

the  optimal  number  of  mesh  points  n*,  the  minimal  number  of  evaluations  C*,  and  the 

© 

minimal  mean  CPU  time  C*  are  given.  Note  that  these  all  behave  as  predicted.  In 
addition,  we  computed  the  ratio  of  the  mean  CPU  time  for  a fourth-order  method 
C*(4,  • ) to  the  minimal  mean  runtime.  As  the  theory  predicts,  this  ratio  appears  to  be 
increasing  without  bound  as  i tends  to  zero.  (The  same  behavior  was  found  for  the 
ratio  Ce(4,  • ) / C*  , where  Ce(4,  • ) is  the  number  of  evaluations  required  by  a fourth- 
order  method.) 

Finally,  we  looked  at  the  "hard"  problem 

*j(t)  - «jj(»)  x,(t)  xj(t)  (1  S i S 2) 


(5.3) 
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«jj(t)  - irjj  exp(-rj|  (t  - r)2)  t"1  sin  t dr  (1  S i,  J i 2) 
(where  "exp"  denotes  the  exponential  function),  with  initial  conditions 

Xj(O)  - x2(0)  - 1 . 

The  tjj  were  all  taken  to  be  one,  while  the  r,j  were  taken  to  be 

•'ll  ■ 1 » *12  ” r22  * » *21  " 

(This  system  of  differential  equations  is  similar  to  the  system  governing  a two-species 

gas  chemical  reaction;  see  e.g.,  Finlayson  [71  J) 

Since  the  system  (5.3)  is  nonscalar  and  nonautonomous,  the  Brent-Runge-Kutta 

methods  are  not  appropriate.  Since  the  derivatives  of  Xj(t)  are  not  readily  available, 

the  Taylor  series  methods  are  not  particularly  easy  to  apply.  Thus  we  used  linear 

Runge-Kutta  methods  for  the  solution  of  (5.3).  The  particular  methods  RKp  of  order  p 

(1  S p S 8)  used  were  as  follows. 

RK1  ...  Euler’s  method 

RK2  ...  Ralston  [66]  (5.6-40)  "modified  Euler" 

RK3  ...  Ralston  [66]  (5.6-45) 

RK4  . . . Ralston  [66]  (5.6-48)  "classical  method" 

RK5  . . . Cassify  [66] 

RK6  . . . Butcher  [64]  (first  method  on  page  192) 

RK7  ...  Shanks  [66] 

RK8  . . . Cooper  and  Verner  [72] 

The  methods  of  order  less  than  eight  have  the  optimal  number  of  stages  per  step, 
while  the  method  of  Cooper  and  Verner  has  the  minimum  number  of  stages  of  all 
eighth-order  methods  known. 

Most  of  the  work  involved  in  solving  (5.3)  was  in  evaluating  *(j(t).  An  obvious 
change  of  variable  reduces  this  to  a Gauss-Hermite  quadrature;  a twenty-point 
quadrature  (Table  25.10  of  Abramowitz  and  Stegun  [64])  was  used  for  maximal 
accuracy.  The  Chebyshev  rational  function  approximation  given  on  page  356  of 
Frbberg  [69]  was  used  to  compute  (sin  r)  / r for  |r|  i 1;  the  system  double-precision 
sine  routine  was  used  for  Irl  > 1 . 


- 
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Since  so  much  of  the  time  required  to  solve  (5.3)  was  spent  in  evaluating  «jj(t), 

the  measure  of  cost  was  the  number  of  evaluations  of  the  set  {«|j(t) : 1 £ i,  j £ 2}  i 

that  is,  we  measured  the  number  of  evaluations  of  the  (vector)  right-hand  side  of  (5.3). 

(Moreover,  the  amount  of  computer  time  required  to  search  for  the  optimum  was  so 

great  as  to  preclude  running  the  problem  a large  number  of  times  and  averaging  the 

results,  as  was  done  in  the  previous  example.)  Results  are  given  in  Table  5.3,  where 

p*,  n*,  and  C*  (defined  as  for  (5.2))  are  given  as  a function  of  the  error  criterion.  The 
c 

table  stops  at  « - 10"^,  since  the  eighth-order  method  (i.e.,  the  highest-order  method 
implemented  for  testing)  was  reached  at  that  level.  Again,  note  that  the  theoretical 
results  predicted  are  confirmed  in  this  difficult  example. 

So,  our  three  numerical  examples  yield  data  which  agree  with  the  theoretical 
result  that  the  optimal  order  p*(a)  increases  with  « - In  *"\  Moreover,  in 
Werschulz  [76a],  [76b],  we  show  that  p*(«)  - 0<«)  as  a t oo  for  these  classes  of 
methods;  i.e.,  the  optimal  order  increases  linearly  with  a.  The  data  in  Tables  5.1-5.3 
support  this  result. 
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TABLE  5.1 

Taylor  Series  Methods  for  Test  Problem 
ft(t)  - X x(t)  x(0)  - 1 


-log10« 

X ■ -e 

X ■ -1 

X - -1/e 

X-  1/e 

X-  1 

X - e 

1 

2 

3 

1 

1 

3 

8 

2 

9 

4 

2 

2 

4 

9 

3 

11 

6 

3 

3 

6 

11 

4 

12 

7 

4 

4 

7 

12 

5 

14 

8 

5 

5 

8 

14 

G 

15 

9 

G 

G 

9 

15 

7 

16 

10 

7 

7 

10 

16 

8 

17 

11 

8 

8 

11 

18 

9 

19 

12 

9 

9 

12 

19 

Notes: 

1.  In  all  cases  except  X - -e,  • ■ 10"*,  the  optimal  mesh-size  was 
h ■ 1.0;  for  this  exceptional  case,  it  was  h » 0.5  . 

2.  Entry  in  table  is  the  optimal  order  for  the  given  X and  «.  This  equals 
the  minimal  number  of  function  evaluations  required  to  solve  the 
problem  on  the  entire  unit  interval,  except  for  the  exceptional  case 
noted  above,  where  four  was  the  minimal  number  of  evaluations. 
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TABLE  5.3 

Linear  Runge-Kutta  Methods  for  Test  Problem 
*<t)  - «jj(t)  Xj(t)  xj(t)  Xj(0)  - 1 (1  $ i 5 2) 

«jj(t)  - >jj  exp(-i»jj  (t  - e)2)  r"1  sin  e dr  (1  & i,  j S 2) 


-log  10* 

H 

n 

«S 

1 

3 

— 

8 

24 

2 

4 

le 

40 

3 

4 

15 

60 

4 

7 

9 

81 

5 

8 

9 

99 
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